Object Oriented Programming

The kind of object-oriented programming most people are familiar with is usually in the style of Java. It's unfortunate because the original concept of object-oriented programming didn't work that way — it's really just a subset of functional programming.

Erlang style

History

It's important to know a bit about the history of Erlang in order to understand why some of the design decisions were made. Erlang was designed at Ericsson for their work on telephony switches.

The best example is the often reported nine nines (99.9999999%) of availability offered on the Ericsson AXD 301 ATM switches, consisting of over 1 million lines of Erlang code.

- Learn You Some Erlang

Erlang The Movie - Hello Mike

All of it's features were designed with practicality in mind, since they were dealing with real world problems: transparent distribution (horizontal and vertical scaling), code hot-swapping, asynchronicity, soft realtime, granular supervision, fault tolerance, load balancing, live debugging, etc.

Erlang is viewed as a functional language because all data structures are immutable (immutability greatly simplifies concurrent programming), pattern matches on messages, and has first class functions. But it's actually multi-paradigm. Some argue that the real benefit of object oriented programming was the message-passing/actor paradigm that came from Smalltalk and Erlang.

Actors

Actors are the Erlang style of object oriented programming. They are treated as black boxes — no actor can mess with the internals of another actor. Instead actors communicate with other actors asynchronously via message passing.

The Erlang engineers came to a lot of great insights about distributed programming, but two really stick out: transparent distribution and fault tolerance.

Transparent Distribution

What they realized was that processes are distributed along multiple machines in large-scale real world applications. Even if they are co-located on the same machine, often times they are basically a black box because it's closed source, or essentially closed source because it's unknown code, or it might migrate to another machine eventually.

This is where the idea of transparent distribution comes from. They realized that they could greatly simplify scalability this way, as it doesn't matter if an actor is on your machine or on a remote machine, or if it's a small thing, or if it grew into a large thing.

This is where the concept of encapsulation in object oriented programming came from. The contents are kept hidden because you can't see those things, they're on another machine! and even if you could see it today, you might not be able to see it tomorrow as things scale.

This is also why actors communicate asynchronously via message passing as well — you might be tempted to do things syncronously when everything is on the same machine today because latency is virtually non-existent, but tomorrow that might not be the case. It's bad news if you need to redesign everything as things start to scale.

Fault Tolerance

Another major insight the Erlang engineers had was that in computing, especially distributed computing, things fail. Types won't help you at all if the hardware fails, or someone trips on a wire, or a solar storm happens, or a hurricane — you get the idea.

Erlang's approach to fault tolerance is inspired by Common Lisp's conditions and restarts system (I'll cover Lisp in another blog post). Erlang errors are propagated across actors as signals, and these signals can be actual UNIX signals like SIGINT and SIGTERM. Some of these signals won't kill an actor, they can be safely ignored, while some of them such as exit signals will. This is just like Common Lisp conditions.

In order to prevent an actor from getting killed by an exit signal, the exits (errors) need to be trapped (caught). Trapping converts the signal into an actual message that the actor can handle. These are like restarts in Common Lisp.

Learn You Some Erlang - It's a trap:

1> process_flag(trap_exit, true).
true
2> spawn_link(fun() -> linkmon:chain(3) end).
<0.49.0>
3> receive X -> X end.
{'EXIT',<0.49.0>,"chain dies here"}

It's very important to understand that this is not like exception handling in more mainstream languages like Java. When an exception is caught in Java, the block continues off at a lower point in the call stack, so exception handling is mostly there just to gracefully die or to completely restart a part of the program from scratch.

Restarts in Common Lisp and Erlang are different. They allow you to restart at the highest point possible of the program and resume with all of the state still intact! In Common Lisp, it resumes at the stack frame and place where it errored with all the state still there. In Erlang, restarts are resumed at the highest point of the program, as a message to be read by the mailbox, and also with all the state because it's treated like a regular message by the actor.

Erlang Processes

There is another feature of Erlang, fast restarts — when things go down, they should be quick to spin back up for maximum reliability. Erlang's virtual machine, BEAM, has a construct specially designed for that: erlang processes.

These aren't like OS processes or threads, but more like something in between. You can't share state between them except via message passing, and they have unique identifiers that can span across machines — sort of like a super OS process, but they're like threads in that they're lightweight and really fast to spin up.

Supervisors

The Erlang engineers realized that a lot of common patterns with similar solutions keep coming up, so they created a set of libraries collectively called OTP to work well with a set of given common patterns called behaviors (which are similar to interfaces).

They also realized that while trapping signals is a great technique, sometimes processes fail for unforseen reasons, or they're better off failing entirely and restarted from scratch. Supervisors come from OTP, and they're meant to address this. A supervisor actor can supervise one or more child actors and perform some action(s) if they fail. Stacking supervisors on top of each other (a supervisor supervising supervisors, etc) creates a supervision tree. This is how Erlang creates highly fault tolerant systems.

Learn You Some Erlang - Supervisors:

-module(band_supervisor).
-behaviour(supervisor).

-export([start_link/1]).
-export([init/1]).

start_link(Type) ->
  supervisor:start_link({local,?MODULE}, ?MODULE, Type).

%% The band supervisor will allow its band members to make a few
%% mistakes before shutting down all operations, based on what
%% mood he's in. A lenient supervisor will tolerate more mistakes
%% than an angry supervisor, who'll tolerate more than a
%% complete jerk supervisor
init(lenient) ->
  init({one_for_one, 3, 60});
init(angry) ->
  init({rest_for_one, 2, 60});
init(jerk) ->
  init({one_for_all, 1, 60});

Ocaml style

Ocaml has mutable objects similar to Java except that mutation is more principled and explicit. And most Ocaml engineers use modules instead, which are like a more powerful expressive objects, to codify similar patterns.

Mutable Objects

Mutable values in Ocaml objects have to be defined explicitly using the mutable keyword:

let stack init = object
  val mutable value = init

  method pop =
    match value with
    | hd :: tl ->
      value <- tl;
      Some hd
    | [] -> None

  method push hd =
    value <- hd :: value
end

Mutability is also more disciplined in Ocaml — the mutable value in the above example can't spill out of the object, it can only live inside the object instance. This is like Clojure's transients but for objects.

Refs

Another similar construct is refs. This is also more principled and disciplined than mutation in Java because ref is a boxed type — an int ref, for example, can't be passed to a function that expects int, it has to be dereferenced first. This way mutation doesn't accidentally spill out into the rest of your program, it's only where it's supposed to be:

let hi: string ref = ref "hello"

let f (x: string) = print_endline x

(* This won't compile:

let _ = f hi

  It'll error out with:

  This has type:
    string ref
  But somewhere wanted:
    string
*)

(* Dereferencing hi with an ! will take out the value *)
let _ = f !hi

Type signatures are optional in Ocaml, they're only annotated in the example for clarity.

Immutable Objects

Objects don't need to be mutable, even in Java. However, without the right language constructs it'll feel really awkward:

let immutable_stack init = object
  val value = init

  method pop =
    match value with
    | hd :: tl -> Some (hd, {< value = tl >})
    | [] -> None

  method push hd =
    {< value = hd :: value >}
end

The {< ... >} is syntax that produces a copy of the current object with the specified fields updated. Immutable objects are interesting but rarely used because Ocaml has such a powerful module system.

Module system

What it's not for

You can implement the same mutable stack example from earlier, but this isn't really what modules are good at:

module type INIT = sig
  type 'a t and a
  val value : a t
end

module type STACK = sig
  type 'a t and a
  val pop : unit -> a option
  val push : a -> unit
end

module Stack (Init: INIT with type 'a t = 'a list):
  STACK with type a = Init.a and type 'a t = 'a Init.t = struct
  type 'a t = 'a Init.t and a = Init.a
  let value = ref Init.value
  let pop () =
    match !value with
    | hd :: _ -> Some hd
    | [] -> None
  let push hd =
    value := hd :: !value
end

It's possible just more verbose. There are some advantages where modules allow you to express more complex polymorphic mutable containers that regular objects can't do, but these are advanced situations and rare. If you need to implement a stack you're better off using objects.

For code organization

The module system is very useful for code organization:

module Swatch = struct
  module Color = struct
    type foreground = [ `white | `lightgrey ]
    type background = [ `black | `darkgray ]
  end

  module Size = struct
    let small = `px 12
    let medium = `px 16
    let large = `px 24
  end

  module Font = struct
    type value = [ `fira_sans | `fira_code ]
    type t = [ `font of value ]
    let main = `font `fira_sans
    let code = `font `fira_code
  end
end

(* stubbed function that uses a font *)
let f (_: Swatch.Font.t) = ()
let _ = f Swatch.Font.code

It's also useful for situations where a module naturally subdivides into multiple sections, but the sections aren't important enough to get their own file (yet). This way you can grow your app in a very natural way. You can nest modules as much as you want.

Generic functions

This is another great use case for modules: implementing generic functions that operate on any type as long as they implement a specific interface.

In this example there's a module type (basically an interface) called APPENDABLE. Any type that has an empty value and a way to append values together can implement that interface. For example, strings have an empty value of "", and append would concatenate strings together. And integers have an empty value of 0 and append would add integers together.

Here I'm defining a Collector module, that defines a collect function. That function will take a list of anything that's APPENDABLE and accumulate the values together:

module type APPENDABLE = sig
  type t
  val empty : t
  val append : t -> t -> t
end

module Collector (T: APPENDABLE) = struct
  let collect (x: T.t list) =
    x |> List.fold_left T.append T.empty
end

module StringCollector = Collector (struct
  type t = string
  let empty = ""
  let append a b = a ^ b
end)

module IntCollector = Collector (struct
  type t = int
  let empty = 0
  let append a b = a + b
end)

(* prints: "foobarbaz" *)
let _ =
  ["foo"; "bar"; "baz"]
  |> StringCollector.collect
  |> print_endline

(* prints: 10 *)
let _ =
  [1; 2; 3; 4]
  |> IntCollector.collect
  |> string_of_int
  |> print_endline

Note the syntax on Collector. This is called a functor, which is like a module-function — it takes one or more modules (in this case one) and returns a module.

Flexible strongly-typed containers

Ocaml modules are first-class, which means you can pass them around in functions. First class modules pass around type information, so you can have a container that normally would only take a homogenous type, like an array of strings or a list of integers, can have it take in different types:

module type V = sig
  type t
  val x : t
  val to_string : t -> string
end

(* A collection of heterogeneous modules! *)
let abc: (module V) list =
  [
    (module struct
      type t = int
      let x = 123
      let to_string = string_of_int
    end);
    (module struct
      type t = string
      let x = "hi"
      let to_string x = x
    end);
    (module struct
      type t = bool
      let x = true
      let to_string = string_of_bool
    end)
  ]

(* Iterate over them and print the string value *)
let _ =
  abc
  |> List.iter @@ fun item ->
       let module M = (val item : V) in
       print_endline @@ M.to_string M.x

You might wonder how that's possible since Ocaml is strongly typed. Ocaml only supports upcasting, so the type information in the module becomes the abstract type t unless you explicitly use with_type (the polymorphic type becomes a rigid type to preserve type safety), which is why a to_string function is provided.

You can store the type information in a variant though if you need to. Or if it's a finite set you can provide extra type info and store it in a tuple:

let xyz =
  (
    (module struct
      type t = int
      let x = 123
      let to_string = string_of_int
    end : V with type t = int)
  , (module struct
      type t = string
      let x = "hi"
      let to_string x = x
    end : V with type t = string)
  , (module struct
      type t = bool
      let x = true
      let to_string = string_of_bool
    end : V with type t = bool)
  )
(* val xyz : (module V with type t = int)
           * (module V with type t = string)
           * (module V with type t = bool) *)

First class modules or functors?

You might wonder why you would use a functor over a first-class module, since both can do the same thing of taking modules as arguments and returning modules. There are specific cases where the Ocaml compiler is stricter with first class modules. It will throw an error that a value will escape it's scope. Otherwise, first class modules might be more expressive. This is the Collector example from earlier with first class modules:

module type EMPTY = sig
  type t
  val empty : t
end

module type APPENDABLE = sig
  type t
  val empty : t
  val append : t -> t -> t
end

module type APPENDABLE_V = sig
  include APPENDABLE
  val value : t
end

let empty (type a) empty: (module EMPTY with type t = a) =
  (module struct
    type t = a
    let empty = empty
  end)

let appendable (type a) empty append (value: a):
  (module APPENDABLE_V with type t = a) =
  let module Empty = (val empty: EMPTY with type t = a) in
  (module struct
    include Empty
    let append = append
    let value = value
  end)

let collect (type a) empty x: a =
  let module E = (val empty: EMPTY with type t = a) in
  x
  |> List.fold_left (fun acc item -> begin
       let module I = (val item: APPENDABLE_V with type t = a) in
       I.append acc I.value
     end)
     E.empty

module Empty = struct
  let string = empty ""
  and int = empty 0
  and bool = empty true
end

let string = appendable Empty.string (^)
let int = appendable Empty.int (+)
let bool = appendable Empty.bool (||)

let x: string =
  ["foo"; "bar"; "baz"]
  |> List.map string
  |> collect Empty.string
(* prints: "foobarbaz" *)

let y: int =
  [123; 456; 789]
  |> List.map int
  |> collect Empty.int
(* prints: 1368 *)

let z: bool =
  [true; false; true]
  |> List.map bool
  |> collect Empty.bool
(* prints: true *)